|
Reliability, availability, and serviceability (RAS) is a computer hardware engineering term involving reliability engineering, high availability, and serviceability design. The phrase was originally used by International Business Machines (IBM) to describe the robustness of its mainframe computers.〔"The acronym RAS (reliability, accessibility and serviceability) came into widespread acceptance at IBM as the replacement for the subset notion of recovery management."〕〔"The dependability () experienced by other System/370 users is the result of a strategy based on RAS (Reliability-Availability-Serviceability)"〕 Computers designed with higher levels of RAS have many features that protect data integrity and help them stay available for long periods of time without failure.〔Sam Siewert〕 This data integrity and uptime are a particular selling point for mainframes and fault-tolerant systems.

==Definitions==
While RAS originated as a hardware-oriented term, systems thinking has extended the concept of reliability-availability-serviceability to systems in general, including software.

* ''Reliability'' can be defined as the probability that a system will produce correct outputs up to some given time ''t''. Reliability is enhanced by features that help to avoid, detect and repair hardware faults. A reliable system does not silently continue and deliver results that include uncorrected corrupted data. Instead, it detects and, if possible, corrects the corruption. For transient (soft) or intermittent errors, it may retry the operation. For uncorrectable errors, it may isolate the fault and report it to higher-level recovery mechanisms (which may fail over to redundant replacement hardware, etc.), or it may halt the affected program or the entire system and report the corruption.
Reliability can be characterized in terms of mean time between failures (MTBF), with reliability = exp(-t/MTBF).
* ''Availability'' means the probability that a system is operational at a given time, i.e. the amount of time a device is actually operating as a percentage of the total time it should be operating. Availability is typically given as a percentage of the time a system is expected to be available, e.g., 99.999 percent ("five nines"). High-availability systems may instead report availability in terms of minutes or hours of downtime per year. Availability features allow the system to stay operational even when faults do occur. A highly available system would disable the malfunctioning portion and continue operating at a reduced capacity. In contrast, a less capable system might crash and become totally nonoperational.
* ''Serviceability'' or ''maintainability'' is the simplicity and speed with which a system can be repaired or maintained; if the time to repair a failed system increases, then availability will decrease. Serviceability includes various methods of easily diagnosing the system when problems arise. Early detection of faults can decrease or avoid system downtime. For example, some enterprise systems can automatically call a service center (without human intervention) when the system experiences a system fault. The traditional focus has been on making the correct repairs with as little disruption to normal operations as possible.

Note the distinction between reliability and availability: reliability measures the ability of a system to function correctly, including avoiding data corruption, whereas availability measures how often the system is available for use, even though it may not be functioning correctly. For example, a server may run forever and so have ideal availability, but may be unreliable, with frequent data corruption.

Excerpt source: Wikipedia, the free encyclopedia, "Reliability, availability and serviceability (computing)".
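
The quantities defined above can be made concrete with a little arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, the year is assumed to have 365 days, and the steady-state formula availability = MTBF / (MTBF + MTTR), where MTTR is the mean time to repair, is a common textbook model that the text above does not state. It is used here only to show why a longer repair time lowers availability.

```python
import math

# Assumption of this sketch: a non-leap year of 365 days.
MINUTES_PER_YEAR = 365 * 24 * 60


def reliability(t_hours, mtbf_hours):
    """Probability of correct operation up to time t: exp(-t/MTBF)."""
    return math.exp(-t_hours / mtbf_hours)


def downtime_minutes_per_year(availability_percent):
    """Convert an availability percentage into expected downtime per year."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


def steady_state_availability(mtbf_hours, mttr_hours):
    """Assumed textbook model: fraction of time the system is up,
    given mean time between failures and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Reliability over a mission time equal to the MTBF is exp(-1).
    print(f"R(t = MTBF)        = {reliability(1000, 1000):.4f}")
    # "Five nines" expressed as minutes of downtime per year.
    print(f"99.999% downtime   = {downtime_minutes_per_year(99.999):.3f} min/year")
    # Slower repair (larger MTTR) gives lower availability.
    print(f"A(MTTR = 1 h)      = {steady_state_availability(1000, 1):.6f}")
    print(f"A(MTTR = 10 h)     = {steady_state_availability(1000, 10):.6f}")
```

Under these assumptions, "five nines" corresponds to roughly five minutes of downtime per year, and increasing the repair time from 1 hour to 10 hours at a fixed MTBF lowers the computed availability, which matches the serviceability remark above.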
|